10 research outputs found

    Higher level techniques for the artistic rendering of images and video

    EThOS - Electronic Theses Online Service (United Kingdom)

    Magic Layouts: Structural Prior for Component Detection in User Interface Designs

    No full text
    We present Magic Layouts, a method for parsing screenshots or hand-drawn sketches of user interface (UI) layouts. Our core contribution is to extend existing detectors to exploit a learned structural prior for UI designs, enabling robust detection of UI components such as buttons, text boxes, and similar widgets. Specifically, we learn a prior over mobile UI layouts, encoding common spatial co-occurrence relationships between different UI components. Conditioning region proposals using this prior leads to performance gains on UI layout parsing for both hand-drawn UIs and app screenshots, which we demonstrate within the context of an interactive application for rapidly acquiring digital prototypes of user experience (UX) designs. Comment: CVPR 2021
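
    The conditioning step can be pictured as re-weighting a detector's region proposals by how plausibly each candidate component co-occurs with its neighbours. The sketch below is a toy illustration of that idea, not the authors' implementation; the class list, the prior matrix, and the rescore rule are all invented for illustration.

    # Minimal sketch (not the authors' code): re-scoring detector proposals for UI
    # components with a learned spatial co-occurrence prior. All names and the
    # toy prior below are illustrative assumptions.
    import numpy as np

    CLASSES = ["button", "text_box", "image", "toolbar"]

    # prior[i][j]: hypothetical probability that class j appears *below* class i
    # in a mobile UI layout, e.g. estimated by counting pairs in a layout corpus.
    prior = np.array([
        [0.10, 0.30, 0.20, 0.05],
        [0.40, 0.15, 0.20, 0.05],
        [0.30, 0.35, 0.10, 0.05],
        [0.20, 0.40, 0.30, 0.02],
    ])

    def rescore(proposals):
        """proposals: list of dicts with 'cls' (int), 'score' (float), 'box' (x1, y1, x2, y2).
        Boost each proposal by how well it co-occurs with proposals detected above it."""
        out = []
        for p in proposals:
            boost = 1.0
            for q in proposals:
                if q is p:
                    continue
                if q["box"][3] <= p["box"][1]:          # q sits above p in the layout
                    boost += q["score"] * prior[q["cls"], p["cls"]]
            out.append({**p, "score": p["score"] * boost})
        return out

    props = [
        {"cls": 3, "score": 0.9, "box": (0, 0, 320, 40)},     # toolbar at the top
        {"cls": 1, "score": 0.4, "box": (20, 60, 300, 100)},  # text box below it
    ]
    print([round(p["score"], 3) for p in rescore(props)])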

    Robust Synthesis of Adversarial Visual Examples Using a Deep Image Prior

    No full text
    We present a novel method for generating robust adversarial image examples building upon the recent ‘deep image prior’ (DIP) that exploits convolutional network architectures to enforce plausible texture in image synthesis. Adversarial images are commonly generated by perturbing images to introduce high frequency noise that induces image misclassification, but that is fragile to subsequent digital manipulation of the image. We show that using DIP to reconstruct an image under adversarial constraint induces perturbations that are more robust to affine deformation, whilst remaining visually imperceptible. Furthermore we show that our DIP approach can also be adapted to produce local adversarial patches (‘adversarial stickers’). We demonstrate robust adversarial examples over a broad gamut of images and object classes drawn from the ImageNet dataset.
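
    The core optimisation can be sketched as follows: a randomly initialised convolutional generator re-synthesises the image while its parameters are pushed towards an attacker-chosen class on a frozen classifier. Everything below (the toy generator, the unweighted ResNet stand-in, the loss weighting, the step count) is an illustrative assumption, not the paper's code.

    # Illustrative sketch of a DIP-style adversarial reconstruction loop.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    from torchvision.models import resnet18

    device = "cpu"
    classifier = resnet18(weights=None).to(device).eval()   # stand-in for a real, pretrained model
    for p in classifier.parameters():
        p.requires_grad_(False)

    target_image = torch.rand(1, 3, 224, 224, device=device)  # stand-in for a real photo
    target_class = torch.tensor([123], device=device)          # attacker-chosen label

    generator = nn.Sequential(                                  # tiny DIP-style generator
        nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
        nn.Conv2d(64, 3, 3, padding=1), nn.Sigmoid(),
    ).to(device)
    z = torch.randn(1, 32, 224, 224, device=device)            # fixed noise input

    opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
    for step in range(100):                                     # a short run for illustration
        opt.zero_grad()
        x = generator(z)
        recon = F.mse_loss(x, target_image)                     # stay visually close to the original
        attack = F.cross_entropy(classifier(x), target_class)   # induce the target misclassification
        loss = recon + 0.01 * attack
        loss.backward()
        opt.step()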

    OSCAR-Net: Object-centric Scene Graph Attention for Image Attribution

    No full text
    Images tell powerful stories but cannot always be trusted. Matching images back to trusted sources (attribution) enables users to make a more informed judgment of the images they encounter online. We propose a robust image hashing algorithm to perform such matching. Our hash is sensitive to manipulation of subtle, salient visual details that can substantially change the story told by an image. Yet the hash is invariant to benign transformations (changes in quality, codecs, sizes, shapes, etc.) experienced by images during online redistribution. Our key contribution is OSCAR-Net (Object-centric Scene Graph Attention for Image Attribution Network), a robust image hashing model inspired by recent successes of Transformers in the visual domain. OSCAR-Net constructs a scene graph representation that attends to fine-grained changes of every object’s visual appearance and their spatial relationships. The network is trained via contrastive learning on a dataset of original and manipulated images, yielding a state of the art image hash for content fingerprinting that scales to millions of images.
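
    At a high level the hash is trained contrastively so that benign copies of an image stay close while manipulated copies move apart, and the continuous embedding is binarised at inference. The sketch below illustrates only that training signal with a toy CNN encoder and an assumed margin loss; OSCAR-Net's actual scene-graph Transformer is not reproduced here.

    # Illustrative sketch of contrastive training for a manipulation-sensitive image hash.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class HashEncoder(nn.Module):
        def __init__(self, bits=64):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1),
            )
            self.proj = nn.Linear(32, bits)

        def forward(self, x):
            return F.normalize(self.proj(self.features(x).flatten(1)), dim=1)

        def hash(self, x):
            return (self.forward(x) > 0).int()          # binary code at inference time

    enc = HashEncoder()
    opt = torch.optim.Adam(enc.parameters(), lr=1e-3)
    margin = 0.5

    # stand-ins for (original, benign copy, manipulated copy) triplets
    orig, benign, manip = (torch.rand(8, 3, 64, 64) for _ in range(3))

    za, zp, zn = enc(orig), enc(benign), enc(manip)
    pos = (1 - F.cosine_similarity(za, zp)).mean()              # pull benign pairs together
    neg = F.relu(F.cosine_similarity(za, zn) - margin).mean()   # push manipulated pairs apart
    loss = pos + neg
    loss.backward()
    opt.step()
    print(enc.hash(orig[:1]))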

    Dataset for "Higher Level Techniques for the Artistic Rendering of Images and Video"

    No full text
    This dataset contains images and videos demonstrating several Artistic Rendering algorithms presented in the thesis "Higher Level Techniques for the Artistic Rendering of Images and Video". In contrast to techniques that consider only a small image region local to each rendering stroke, or only the current and preceding frame in a video, these algorithms use a higher spatio-temporal level of analysis to broaden the range of potential rendering styles, enhance temporal coherence in animations, and improve the aesthetic quality of renderings. The images in the dataset are high resolution versions of paintings presented in the thesis (chapters 3 and 4). The videos exhibit various renderings or source footage for the ‘ballet’, ‘basketball’, ‘bounce’, ‘contraption’, ‘cricket’, ‘metronome’, ‘poohbear’, ‘sheep’, ‘spheres’, ‘stairs’, ‘volley’, ‘wand’, and ‘wave’ video sequences (chapters 6 to 8).

    Scene Designer: a Unified Model for Scene Search and Synthesis from Sketch

    No full text
    Scene Designer is a novel method for searching and generating images using free-hand sketches of scene compositions; i.e. drawings that describe both the appearance and relative positions of objects. Our core contribution is a single unified model to learn both a cross-modal search embedding for matching sketched compositions to images, and an object embedding for layout synthesis. We show that a graph neural network (GNN) followed by Transformer under our novel contrastive learning setting is required to allow learning correlations between object type, appearance and arrangement, driving a mask generation module that synthesizes coherent scene layouts, whilst also delivering state of the art sketch based visual search of scenes.
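
    A rough picture of the search half of such a unified model: object nodes from a sketched composition are mixed by message passing, refined by a Transformer, and pooled into an embedding compared against image embeddings by cosine similarity. The code below is a minimal stand-in under those assumptions, not the Scene Designer architecture.

    # Illustrative sketch: encode a sketched composition as object nodes and search a gallery.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class CompositionEncoder(nn.Module):
        def __init__(self, obj_dim=64, out_dim=128):
            super().__init__()
            self.msg = nn.Linear(obj_dim, obj_dim)                 # crude GNN-style message step
            layer = nn.TransformerEncoderLayer(d_model=obj_dim, nhead=4, batch_first=True)
            self.tf = nn.TransformerEncoder(layer, num_layers=2)
            self.out = nn.Linear(obj_dim, out_dim)

        def forward(self, nodes, adj):
            # nodes: [B, N, obj_dim] per-object features; adj: [B, N, N] spatial relations
            nodes = nodes + torch.bmm(adj, self.msg(nodes))        # aggregate neighbour messages
            nodes = self.tf(nodes)
            return F.normalize(self.out(nodes.mean(dim=1)), dim=1) # pooled scene embedding

    enc = CompositionEncoder()
    nodes = torch.rand(2, 5, 64)                                   # two sketches, five objects each
    adj = torch.rand(2, 5, 5)
    query = enc(nodes, adj)

    image_embeddings = F.normalize(torch.rand(1000, 128), dim=1)   # precomputed gallery (stand-in)
    scores = query @ image_embeddings.T                            # cosine similarity search
    print(scores.topk(5, dim=1).indices)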

    Deep Image Comparator: Learning to Visualize Editorial Change

    No full text
    We present a novel architecture for comparing a pair of images to identify image regions that have been subjected to editorial manipulation. We first describe a robust near-duplicate search, for matching a potentially manipulated image circulating online to an image within a trusted database of originals. We then describe a novel architecture for comparing that image pair, to localize regions that have been manipulated to differ from the retrieved original. The localization ignores discrepancies due to benign image transformations that commonly occur during online redistribution. These include artifacts due to noise and recompression degradation, as well as out-of-place transformations due to image padding, warping, and changes in size and shape. Robustness towards out-of-place transformations is achieved via the end-to-end training of a differentiable warping module within the comparator architecture. We demonstrate effective retrieval and comparison of benign transformed and manipulated images, over a dataset of millions of photographs.
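
    The differentiable warping step can be illustrated with an affine warp implemented via a sampling grid, followed by a simple per-pixel discrepancy map. In the full system the warp parameters would be regressed by a trained module; here they are hard-coded as an assumed stand-in, so this is only a sketch of the alignment-then-compare idea.

    # Illustrative sketch of differentiable alignment followed by a change heat map.
    import torch
    import torch.nn.functional as F

    def align_and_compare(original, candidate, theta):
        """original, candidate: [1, 3, H, W]; theta: [1, 2, 3] affine parameters."""
        grid = F.affine_grid(theta, original.shape, align_corners=False)
        warped = F.grid_sample(candidate, grid, align_corners=False)   # differentiable warp
        heatmap = (original - warped).abs().mean(dim=1, keepdim=True)  # per-pixel discrepancy
        return warped, heatmap

    original = torch.rand(1, 3, 256, 256)
    candidate = torch.roll(original, shifts=(4, 4), dims=(2, 3))       # stand-in out-of-place shift

    # identity transform plus a small translation as an illustrative estimate
    theta = torch.tensor([[[1.0, 0.0, 0.03],
                           [0.0, 1.0, 0.03]]])
    warped, heatmap = align_and_compare(original, candidate, theta)
    print(heatmap.mean().item())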

    Compositional Sketch Search

    No full text
    We present an algorithm for searching image collections using free-hand sketches that describe the appearance and relative positions of multiple objects. Sketch-based image retrieval (SBIR) methods predominantly match queries containing a single, dominant object invariant to its position within an image. Our work exploits drawings as a concise and intuitive representation for specifying entire scene compositions. We train a convolutional neural network (CNN) to encode masked visual features from sketched objects, pooling these into a spatial descriptor encoding the spatial relationships and appearances of objects in the composition. Training the CNN backbone as a Siamese network under triplet loss yields a metric search embedding for measuring compositional similarity which may be efficiently leveraged for visual search by applying product quantization.
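
    The retrieval pipeline described here amounts to metric learning with a triplet loss plus product quantization for scale. The sketch below shows one such step with a toy encoder and the faiss library's product quantizer; the real backbone, feature masking, and spatial pooling are not reproduced, and all data are random stand-ins.

    # Illustrative sketch: triplet training step, then product-quantized search.
    import numpy as np
    import torch
    import torch.nn as nn

    encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128))   # toy stand-in backbone
    triplet = nn.TripletMarginLoss(margin=0.2)
    opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)

    # anchor sketch composition, matching image, non-matching image (random stand-ins)
    anchor, positive, negative = (torch.rand(16, 3, 32, 32) for _ in range(3))
    loss = triplet(encoder(anchor), encoder(positive), encoder(negative))
    loss.backward()
    opt.step()

    # Product quantization for efficient search (requires the faiss package).
    import faiss
    gallery = np.random.rand(10000, 128).astype("float32")   # precomputed image descriptors
    index = faiss.IndexPQ(128, 16, 8)                         # 16 sub-quantizers, 8 bits each
    index.train(gallery)
    index.add(gallery)
    query = encoder(anchor[:1]).detach().numpy().astype("float32")
    distances, ids = index.search(query, 5)                   # top-5 compositional matches
    print(ids)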

    ALADIN: All Layer Adaptive Instance Normalization for Fine-grained Style Similarity

    No full text
    We present ALADIN (All Layer AdaIN), a novel architecture for searching images based on the similarity of their artistic style. Representation learning is critical to visual search, where distance in the learned search embedding reflects image similarity. Learning an embedding that discriminates fine-grained variations in style is hard, due to the difficulty of defining and labelling style. ALADIN takes a weakly supervised approach to learning a representation for fine-grained style similarity of digital artworks, leveraging BAM-FG, a novel large-scale dataset of user generated content groupings gathered from the web. ALADIN sets a new state of the art accuracy for style-based visual search over both coarse labelled style data (BAM) and BAM-FG, a new 2.62 million image dataset of 310,000 fine-grained style groupings also contributed by this work.
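
    The "all layer AdaIN" idea can be illustrated by collecting per-channel mean and standard deviation statistics from every layer of an encoder and concatenating them into a style descriptor for similarity search. The sketch below uses a toy encoder and is only an approximation of that idea, not the ALADIN model or its training.

    # Illustrative sketch of an all-layer AdaIN-statistics style descriptor.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    layers = nn.ModuleList([                                        # toy encoder; a VGG-style
        nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU()),   # backbone would be used in practice
        nn.Sequential(nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU()),
        nn.Sequential(nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU()),
    ])

    def style_descriptor(x):
        stats = []
        for layer in layers:
            x = layer(x)
            stats.append(x.mean(dim=(2, 3)))                    # AdaIN-style channel means
            stats.append(x.std(dim=(2, 3)))                     # and standard deviations
        return F.normalize(torch.cat(stats, dim=1), dim=1)      # one vector per image

    artwork_a = torch.rand(1, 3, 128, 128)
    artwork_b = torch.rand(1, 3, 128, 128)
    similarity = (style_descriptor(artwork_a) * style_descriptor(artwork_b)).sum()
    print(similarity.item())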